Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DPE-4614] Add support for tracing through tempo-k8s #149

Merged
merged 7 commits into from
Jun 13, 2024

Conversation

shayancanonical
Copy link
Contributor

Issue

We would like to be able to instrument our charm and generate traces to discover inefficiencies in charm code execution. The observability team has developed a charm for tempo that we would like to integrate with.

Solution

Integrate with the tempo charm

Considerations

  1. We are not supporting sending traces to tempo-k8s using HTTPS are there are still some rough edges to be rounded with this integration
  2. We need to use the edge version of cos-lite until they release to stable as it contains some critical fixes

image

Copy link
Contributor

@carlcsaposs-canonical carlcsaposs-canonical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

questions:

  • does the decorator from charm_tracing make debugging the charm with pdb more difficult?
  • any significant performance penalty or decreased resilience in case of network failures (e.g. if tracing code has bug and raises exception, will that affect HA/upgrades/rollbacks)?
  • any issues if some units of the charm have a different version of the tracing code lib (e.g. while an upgrade is in-progress)?

@shayancanonical
Copy link
Contributor Author

  • debugging with pdb would be a bit more complex with the decorator as there will be additional function calls on the stack than just the charm code -- the decorator overwrites how the charm is invoked, injecting code to instrument a function call before and after an event handler/extra type method is called. i believe you will still be able to set breakpoints and jump to them. i would suspect that stepping through code may be a bit more complex
  • if there's a relation with tempo, then yes, there will be decreased resiliency when there's an error + there will be a performance penalty. however, tempo is not meant to be used in production -- it is meant to be used locally to determine why some hook may take a long time to run
  • i see no issues if the charm code changes. if it changes to not have the decorator, we wont instrument. we are instrumenting units independently, so no cluster implications

Copy link
Contributor

@taurus-forever taurus-forever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thank you!
Please prepare the documentation post how to enable it (as a new document):
https://charmhub.io/mysql-router/docs/h-enable-monitoring?channel=dpe/edge

@carlcsaposs-canonical
Copy link
Contributor

tempo is not meant to be used in production -- it is meant to be used locally to determine why some hook may take a long time to run

when I discussed this with @PietroPasotti a few months ago, my understanding was that this was meant to be used in production—has that changed?

if it's not meant to be used in production, should we merge to main?

@shayancanonical
Copy link
Contributor Author

charm tracing is not expected to be enabled in production as i understand it, but i still think that there is value in pulling a charm from one of our tracks in a non-prod env and instrument tracing. the tracing integration is an optional one, and does not add any issues if not related to tempo

Copy link
Contributor

@paulomach paulomach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good



@trace_charm(
tracing_endpoint="tracing_endpoint",
extra_types=(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the criteria for choosing what goes in extra_types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any extra "class" that contains logic that we would like to instrument needs to go into extra_types. if there's a class that is not the charm or in extra_types, when the code in that class is run, the trace will have an empty pocket as it will not be instrumented

@shayancanonical shayancanonical changed the title Add support for tracing through tempo-k8s [DPE-4614] Add support for tracing through tempo-k8s Jun 12, 2024
@shayancanonical shayancanonical merged commit 7c24a35 into main Jun 13, 2024
34 checks passed
@shayancanonical shayancanonical deleted the feature/tempo_tracing branch June 13, 2024 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants